Investigating food production‐associated DNA methylation changes in paleogenomes: Lack of consistent signals beyond technical noise

Abstract The Neolithic transition introduced major diet and lifestyle changes to human populations across continents. Beyond well‐documented bioarcheological and genetic effects, whether these changes also had molecular‐level epigenetic repercussions in past human populations has been an open question. In fact, methylation signatures can be inferred from UDG‐treated ancient DNA through postmortem damage patterns, but with low signal‐to‐noise ratios; it is thus unclear whether published paleogenomes would provide the necessary resolution to discover systematic effects of lifestyle and diet shifts. To address this we compiled UDG‐treated shotgun genomes of 13 pre‐Neolithic hunter‐gatherers (HGs) and 21 Neolithic farmers (NFs) individuals from West and North Eurasia, published by six different laboratories and with coverage c.1×–58× (median = 9×). We used epiPALEOMIX and a Monte Carlo normalization scheme to estimate methylation levels per genome. Our paleomethylome dataset showed expected genome‐wide methylation patterns such as CpG island hypomethylation. However, analyzing the data using various approaches did not yield any systematic signals for subsistence type, genetic sex, or tissue effects. Comparing the HG‐NF methylation differences in our dataset with methylation differences between hunter‐gatherers versus farmers in modern‐day Central Africa also did not yield consistent results. Meanwhile, paleomethylome profiles did cluster strongly by their laboratories of origin. Using larger data volumes, minimizing technical noise and/or using alternative protocols may be necessary for capturing subtle environment‐related biological signals from paleomethylomes.

terns, but with low signal-to-noise ratios; it is thus unclear whether published paleogenomes would provide the necessary resolution to discover systematic effects of lifestyle and diet shifts.To address this we compiled UDG-treated shotgun genomes of 13 pre-Neolithic hunter-gatherers (HGs) and 21 Neolithic farmers (NFs) individuals from West and North Eurasia, published by six different laboratories and with coverage c.1×-58× (median = 9×).We used epiPALEOMIX and a Monte Carlo normalization scheme to estimate methylation levels per genome.Our paleomethylome dataset showed expected genome-wide methylation patterns such as CpG island hypomethylation.However, analyzing the data using various approaches did not yield any systematic signals for subsistence type, genetic sex, or tissue effects.Comparing the HG-NF methylation differences in our dataset with methylation differences between hunter-gatherers versus farmers in modern-day Central Africa also did not yield consistent results.Meanwhile, paleomethylome profiles did cluster strongly by their laboratories of origin.Using larger data volumes, minimizing technical noise and/ or using alternative protocols may be necessary for capturing subtle environmentrelated biological signals from paleomethylomes.

K E Y W O R D S
ancient DNA, DNA methylation, epigenetics, genomics/proteomics, human evolution, Neolithic transition

| INTRODUC TI ON
The last 12,000 years saw diverse human populations shift from mobile hunter-gathering to Neolithic lifeways involving sedentism and food production.These Neolithic transitions not only brought about changes in diet but also major shifts in daily activities, an increase in population density, as well as institutionalized social inequalities (Bar-Yosef & Belfer-Cohen, 1992;Richards, 2002).
Beyond their social impact, how these changes shaped human health, physiology, genetics, and epigenetics has long been debated.Bioarcheological evidence points to negative outcomes related to dietary constraints and high population density, such as increasing prevalence of growth disruption, anemia, or dental caries in Neolithic populations compared to foragers (Larsen, 2006;Latham, 2013).Meanwhile, population genomic studies have reported multiple loci that evolved under positive selection pressures related to agriculture and pastoralism.These include the FADS genes involved in polyunsaturated fatty acid metabolism (Buckley et al., 2017) and the LCT gene responsible for lactase persistence (Tishkoff et al., 2007).Even though these selection pressures appear to have gained strength multiple millennia later than the original transitions to food production (Burger et al., 2020;Mathieson & Mathieson, 2018), their documentation is consistent with the notion that food production had significant long-term impacts on human physiology.
It might be likewise expected that Neolithic transitions shifted human epigenetic profiles.Indeed, changes in overall methylation levels have been found in leukocytes related to vegetable-rich versus fat-and meat-rich diets in a human sample from the United States (Zhang et al., 2011).Even more relevant are the results by Fagny et al. (2015), who compared blood methylation profiles between modern-day rainforest hunter-gatherers (MHGs) and modern-day farmers (MFs) living in central Africa.These authors reported thousands of loci showing differential methylation patterns correlated with both historical and recent shifts in lifestyle.
They further identified convergent epigenetic changes in two pairs of MHG and MF populations and associated these changes with immunity and developmental pathways.These results have raised the question of whether past Neolithic human populations may also have experienced similar lifestyle-and diet-related epigenetic shifts.
Unfortunately, most epigenetic information related to physiology is lost in ancient specimens as soft tissue and RNA are not preserved (see Smith et al., 2019 for an exception).However, it has been shown that cytosine methylation sites can survive in ancient DNA.Several studies have used standard protocols for methylation profiling, such as bisulfite sequencing and immunoprecipitation, on ancient DNA (Llamas et al., 2012;Sawyer et al., 2023;Seguin-Orlando et al., 2015;Smith et al., 2015).Meanwhile, methylation information can be indirectly inferred from sequencing data from ancient DNA molecules treated with the UDG (uracil-DNA glycosylase) enzyme.This is based on the knowledge that after death, aDNA molecules undergo widespread cytosine deamination at their broken ends, resulting in C→U (uracil) transitions if the cytosine is unmethylated, and in C→T (thymine) transitions if the cytosine is methylated (Briggs et al., 2007).Treatment of aDNA with UDG eliminates uracil nucleotides from DNA, and when such UDG-treated aDNA is shotgun sequenced, the level of observed C→T transitions at CpG sites allows inferring the relative methylation level at those loci (Pedersen et al., 2014).
Over the last decade, multiple studies have reported successful retrieval of methylation patterns in past organisms using this approach (reviewed by Orlando et al., 2015).Pedersen et al. (2014) studied 20× coverage UDG-treated genomic data produced from a 4000-year-old hair sample from Greenland.These authors reported significant correlations between genome-wide methylation levels inferred from this data with methylation measured in present-day human tissues, with the highest correlations found with hair.This study also found expected signals of hypomethylation in CpG islands in the paleomethylome data and further inferred the age of the ancient individual using a methylation clock.The same year, studying the 52×-coverage Neanderthal and 30×-coverage Denisovan genomes derived from bone material, Gokhman et al. (2014) found overall low CpG methylation rates (<1.5%) as inferred from postmortem deamination; however, binning those methylation scores yielded high correlations with global methylation patterns measured in modern-day human bone samples.These authors further used this data to predict a number of loci, developmental genes, that might be differentially methylated between archaic hominins and modern humans.Hanghøj et al. (2016) published the epiPALEOMIX MethylMap algorithm for estimating methylation scores in UDGtreated ancient DNA libraries with sufficient (e.g., >2×) coverage.
Applying their algorithm to published ancient human genomes, these authors showed tissue-based clustering among at least some paleomethylomes they analyzed.Successful retrieval of paleomethylation signatures has also been reported for other species, including barley, maize, and horses (Liu et al., 2023;Smith et al., 2014;Wagner et al., 2020).Despite the promising results described above, whether lifestylerelated paleomethylation signatures may be retrievable from ancient bone and tooth material remains unknown.It is also unclear whether paleomethylome profiles inferred from data produced in different laboratories and variable coverages could be readily comparable.This is a particularly challenging task because paleomethylome profiles are inferred indirectly, depending on the presence of random postmortem damage at read ends.The signal-to-noise ratio per locus is hence much lower compared to information collected using bisulfite sequencing on present-day tissue samples.Therefore the technical noise caused by different lab protocols could readily overshadow biological signals.
Here we address these questions by investigating methylation patterns across 34 published paleogenomes from hunter-gatherer (HG) and Neolithic farmer (NF) contexts from West and North Eurasia, produced by different laboratories and with a range of depth-of-coverages.We use the data to test the hypothesis that systematic methylation differences may be detected between ancient HG and NF groups, driven by environmental changes.We further ask whether convergent HG-NF epigenetic shifts can be detected between ancient and present-day populations.

| Genome data selection and preprocessing
We selected UDG/USER-treated shotgun-sequenced genomes from published genomic data including 13 HGs and 21 NFs from West and North Eurasia.Sample-related information can be found in Table S1.
We note that the Siberian Bronze Age individuals were included in the HG category since these groups had an HG-like lifestyle with a diet composed mainly of marine and freshwater products (Kılınç et al., 2021).We chose to limit our sample to West and North Eurasia to limit the effect of differences in population genetic background but also tried to keep our sample large enough to increase power.
All data was downloaded as BAM or FASTQ files from the European Nucleotide Archive (ENA; https:// www.ebi.ac.uk/ ena), with reference numbers listed in Table S1.All FASTQ and BAM files were remapped on Homo sapiens genome assembly hs37d5 using "bwa aln" with parameters "-l 16500 -n 0.01 -o 2" (Li & Durbin, 2009).
We filtered out reads of size less than 35 bps, with a mapping quality (MAPQ) of less than 30, and with more than 10% mismatches to the reference genome.We verified the effectiveness of the UDG/USER treatment by studying the PMD profiles created using "pmdtools" (Skoglund et al., 2014) on each genome (Figures S1 and S2).
We called all CG dinucleotide autosomal positions (n = 26,752,702) from the human (hg19) reference genome using the R Bioconductor package "BSgenome.Hsapiens.UCSC.hg19"(Pagès, 2019) and stored these in a BED file.We then filtered these by removing any positions overlapping with SNP positions from dbSNP 142 (Sherry et al., 2001).Our aim here was to avoid confounding between methylation signals and real variants at CpG positions.There remained 13,270,411 autosomal CpG positions in the reference genome.

| Methylation score calculation
We chose to use the software epiPALEOMIX (Hanghøj et al., 2016) over DamMet (Hanghøj et al., 2019); the latter is an alternative methylome mapping software developed by the same group but is described as requiring ≥20× coverage to generate reliable results.Since our dataset median was much lower we decided to use epiPALEOMIX.epiPALEOMIX requires UDG/USER-treated and ≥2×-coverage genomes (we still included three genomes <2× to increase our sample size).The BAM file, the hg19 reference fasta file, the reference BED file for CpG positions, and the library type of the sample (single-stranded/double-stranded) were given as input.We thus constructed our sample set and epiPALEOMIX input files according to these criteria.
We filtered the epiPALEOMIX output files for each CpG position having ≥4 reads to increase the precision of the methylation score (MS) values.This filtering resulted in an average of 3,006,714 CpG positions per genome (10,721,229).We also ran parallel analyses using ≥10 reads per position (Text S1).
We further generated a file that included the information related to the chromosome number, CpG position, and the MS values of each ancient individual as a column by joining all the files by CpG positions.Missing values were presented by "NA." We calculated average MS values per CpG position per individual from the epiPALEOMIX outputs.Let n 1i denote the number of deaminated reads and n 0i denote the number of non-deaminated reads in genome i.We then calculated: M¯S i = n 1i /(n 0i + n 1i ).We also plotted the MS values per individual (Figure 1b) using "ggplot2" function in R Wickham (2016).We used the R (Wickham, 2016) functions "ggmap" and "ggplot" for plotting geographical distributions and the CpG distributions (Figure 1c).
We performed gene annotation using the UCSC Genome Browser table for the hg19 assembly containing only exons (Karolchik et al., 2004).After that, we calculated MS at the promoter sites (4 kb long) by using 2 kb upstream of the first exon on the positive strand.
We also ran epiPALEOMIX on the X chromosomes (chrX) of the same 34 individuals.These chrX datasets were prepared employing the same steps used with the autosomal datasets.

| Monte Carlo normalization
Given the large differences in mean MS values among the genomes (Figure 1), we normalized our ANOVA dataset, which includes all the reads corresponding to CpG positions per individual, by random subsampling the reads so that every individual in the dataset has mean MS value M¯S = 0.02.Note that here we again only use CpG positions with ≥4 reads in each genome.We chose 0.02 as a target as this was on the lower end of the M¯S distribution among the genomes used.
Let n 1i denote the number of originally deaminated reads in genome i, and let n 0i denote the number of originally nondeaminated reads in the same genome.We proceeded as follows: (a) If genome i had original mean MS < 0.02: we subsampled from n 0i a random subset n 0is as n 0is = 49n 1i , so that n 1i /(n 0is + n 1i ) = 0.02.(b) If genome i had original mean MS > 0.02: we subsampled from n 1i a random subset n 1is as n 1is = n 0i /49, so that n 1is /(n 0i + n 1is ) = 0.02.
We ran random subsampling using the function "sample" offered by R.
We repeated the random subsampling 20 times independently to produce 20 normalized datasets.The chrX dataset was also normalized in the same manner, separately.We note that normalization is performed using all reads (on autosomes, or chrX), not just ones that overlap genes.We also normalized the chrX dataset over the autosomal MSs and plotted violin plots for all CpGs and also for the Neolithic individuals reported by Marchi et al., 2022 using the function "vioplot" in base R (Figure 4).

| Gene methylation datasets
We used these 20 normalized datasets to compile methylation levels per gene, in two ways:  We had 20 parallel subsampled datasets of gene MS values.Note that the numbers of genes and CpG positions in each of these 20 datasets were slightly different because of random sampling of reads (e.g., genes with one CpG position might not be represented in some datasets).
b. Gene-averaged data: This single dataset was produced by calculating, per gene, the means of all CpG MS values and averaging these across the 20 subsampled datasets.We thus summarized the dataset into a matrix of 9956 genes × 34 genomes.

| Statistical tests
We used tests from the R "stats" package.All the tests were carried out two-sided unless otherwise indicated.We adjusted p-values for multiple testing using the Benjamini-Hochberg procedure using the R "p.adjust" function.

| Linear mixed effects models
We applied linear mixed effects models to the full data (a) described above, where multiple CpG positions per gene represent an individual.
Since we had fixed (subsistence type, tissue, and genetic sex) and random factors (individual or laboratory-of-origin) in the settings, we decided to conduct linear mixed-effects models employing the R "stats" package "aov" function (R Core Team, 2020).We tested two models that differed in their random factors for each gene: Here, the response variable "deamination" is a binary [0,1] variable that describes how many reads falling into each gene are deaminated or not.Note that this approach suffers from pseudoreplication because the observations (reads) per locus are dependent when multiple reads map to the same locus.To overcome this, we also used the gene-averaged data (b) described above.This time we applied ANOVA and Kruskal-Wallis tests on MS values per gene but without an individual component, using the R "stats" package "aov" and "kruskal.test"functions, respectively (R Core Team, 2020).Here, we have a single observation per gene, and thus the results do not suffer from pseudoreplication.

| Multidimensional scaling analysis
We carried out multidimensional scaling (MDS) analysis on our gene-averaged dataset which included mean MSs per gene averaged 20 subsampled datasets.We used the R's "cmdscale" function.We ran MDS both including all 34 individuals, or using 32 individuals after excluding extreme outliers Motala12 and K14 (Figure 3, Figure S8).

| Gene Ontology enrichment
Gene Ontology (GO) (Consortium, 2008) enrichment analysis (Subramanian et al., 2005) was performed by comparing gene sets with evidence for significant effects (for subsistence type, tissue type, or genetic sex) that had BH-adjusted p-values < 0.05 from the linear mixed-effects models.We used the R "topGO" (Alexa & Rahnenfuhrer, 2019) and "org.Hs.eg.db" packages (Carlson, 2019) to collect GO Biological Process information for the genes.The background gene sets included all 9657-9660 genes across the 20 normalized datasets included in the analyses.We ran the Fisher's exact test within "topGO," and used its "elim" algorithm for transversing the GO hierarchy (removing genes from significantly enriched lower nodes) (Alexa & Rahnenfuhrer, 2019).We also filtered the output to have ≥5 genes per GO term by using the "nodeSize" option while creating the GO data.The p-value threshold for the significance of the GO terms was chosen to be 0.01.We also visualized resulting GO terms using reviGO with default parameters (Supek et al., 2011).
Results for two randomly chosen datasets (of 20 datasets) are shown in Figures S6 and S7.

| Subsistence type-related methylation differences in ancient Eurasian versus modern African datasets
A recently published study uses blood samples taken from individuals to compare modern-day HG (MHG) and modern-day farmer (MF) blood methylation profiles in West and East African rainforests (Fagny et al., 2015).We used the results file of the study which contained the multiple-testing corrected p-values and the logarithm of methylation fold-change between MFs versus MHGs (logFC).In total, the dataset contained 365,401 CpG positions overlapping 19,672 genes.We used this information to estimate correlations between our results and the modern results reported by the original study.
We tested the co-directionality between the logFC values in this dataset and NF-HG differences we calculated in our methylome dataset.In other words, we compared farmer versus HG differences in MS scores δ MSF−HG across overlapping genes between pairs of datasets.Given the variability of MS profiles among genomes from different laboratories, we performed this comparison using subdatasets from three different laboratories that contained both NF and HG individuals (Boston, Stanford, Mainz; see Table S1), and also using 12 HG and 20 NF genomes excluding Motala12 and K14 individuals.We calculated the Spearman's rank correlation between δ MSF−HG values from two datasets across common genes using the R "stats" package function "cor.test"(R Core Team, 2020).We plotted the lowest regression lines for the main laboratory of origins using the R "graphics" package "pairs" function with the "panel.cor"and "panel.smooth"parameters (Figure 4).The correlations and p-values were calculated using Spearman's rank correlation method.For plotting we used the R "graphics" package and "ggplot2" package functions (Wickham, 2016).
To measure methylation rates, we used c.13 million autosomal CpG positions in the reference genome excluding variable positions (Section 2).In this set, an average of c.9 million (3-12 million) CpG's were covered by at least one read per genome.Filtering for a minimum depth of 4 left us with an average of c.3 million (10,000-12 million) CpG positions per genome.Running epiPALEOMIX (Hanghøj et al., 2016) on this data, we computed the number of likely methylated (deaminated) and possibly non-methylated (non-deaminated) reads, and the resulting methylation score (MS) for each CpG position per genome.The distribution of the MS values per CpG site across all 34 genomes revealed average methylation rates of <7% (Figure 1b).This is much lower than the average CpG methylation rates in human tissues (60%-80%) (Anastasiadi et al., 2018;Smith & Meissner, 2013), but in line with published estimates from other paleogenomes (Gokhman et al., 2014;Hanghøj et al., 2016), and is caused by the indirect nature of methylation level measurements in ancient DNA.We also observed multiple-fold differences in mean MS among the 34 paleogenomes (c.1% vs. c.6%), which likely reflects technical effects rather than biological signals (Tables S1 and S2).
Despite these possibly technical effects, we found that CpG islands (CGIs), which are normally hypomethylated regions of the genome, show significantly lower MS scores (Wilcoxon signed rank test p < 1e-10; Table S3) across these 34 paleogenomes, compared to CGI shores (2 kb from CGI) and CGI shelves (4 kb from CGI) and more distant "open sea" areas (Figure 1c; Figures S3 and S4).This indicates that the genome-wide MS values measured here have some degree of biological relevance.

| Tests for subsistence type, tissue, and sex effects: few or no genes with evidence for systematic methylation differences
We next tested for differentially methylated genes (DMGs) related to subsistence type, tissue of origin (tooth or bone), and genetic sex.Before running the tests, to avoid possible biological effects being confounded by inter-genome variability in average MS values (Figure 1b), we normalized the dataset by randomly subsampling reads for every individual genome so that each genome gained a genome-wide mean MS of 0.02 (Section 2).We performed this subsampling 20 times, creating 20 normalized replicate datasets.Using each of these replicates separately, and for each gene, we ran linear mixed effects models: all MS values across a gene as the response, subsistence type, tissue, and sex as fixed effects, and "individual" as the random effect.
We thus tested c.9600 (9657-9660) genes across the 20 normalized datasets, with each gene represented by a median of 261 CpG positions (1-18,097).Among these genes, a total of 55-71 (0.5%-0.7% of tested genes) had ANOVA p < 0.05 for only subsistence type after Benjamini-Hochberg (BH) correction for multiple testing (the range representing the result across the 20 replicate datasets).The number of BH-corrected significant genes for tissue type and genetic sex were 19-39 (0.2%-0.4%) and 0-12 (0%-0.1%),respectively (Figure S5). Figure 2a shows the top genes identified for each factor.We note that this approach may be overestimating effects due to some degree of pseudoreplication, which we address below (Section 2).
We performed functional enrichment analysis using gene ontology (GO) Biological Process categories to identify possible functional roles of DMGs (those passing BH-corrected ANOVA p < 0.05) relative to the background set of 9657-9660 genes across the 20 subsampled datasets.The most enriched GO terms included development-and regulation-related mechanisms (results for two randomly chosen datasets are shown in Figures S6 and S7).However, the results were not significant after multiple testing correction (BHcorrected Fisher's exact test p > 0.05).
We next repeated the previous analysis but this time using the "laboratory-of-origin" as a random effect (instead of "individual").The numbers of genes with sufficient information to execute ANOVA to compute p-values for all categories were 8867-8891 across the 20 subsampled datasets (Section 2).This time, either no gene or a maximum of two genes were significant at BH-corrected p < 0.05 for any of the three fixed factors.The top genes are shown in Figure 2b; similar to those in Figure 2a no strong effects are visible even among these genes.
Instead of using the full data, summarizing MS values per gene might reduce noise and clarify the signal.For each of the 9955 genes and all 34 individuals, we calculated the average MS across all CpG positions covered with a minimum of 4 reads per gene and averaged these across all 20 subsampled datasets (Section 2).Per individual genome, we observed a median of 9685 genes (mean 7859) with a minimum 1 CpG position covered.Using this dataset, we first calculated Euclidean distances in genome-wide MS scores between all pairs of individuals and summarized these using multi-dimensional scaling (MDS).This revealed that the K14 and Motala12 genomes, which also had the lowest coverage of CpG sites in our set, also behaved as outliers in their paleomethylome profiles (Figure S6).
Removing these two genomes, an MDS plot of distances among the remaining 32 genomes revealed salient clustering by laboratoryof-origin (Figure 3).
We further limited the dataset to 9273 genes observed in a minimum of 20 individuals, and ran Kruskal-Wallis with laboratory-of-origin as an explanatory factor, excluding Motala12 and K14 individuals: we found an effect across 14% of genes tested (BH-corrected p < 0.05).In contrast, running the same test using subsistence type, tissue, or sex as explanatory factors yielded no significant genes at this cutoff.Performing this analysis by limiting the F I G U R E 2 Representative genes with the most significant differential methylation signals in linear mixed model analyses, related to subsistence type, tissue, and sex.The x-axis represents the factors while the y-axis represents the mean MS values per gene per individual.(a) Genes chosen using models with "individual" as random factor.Left panel: ICAM5 (subsistence type p < 0.01).Middle panel: ATPB1 (tissue type p < 0.01).Right panel: CEP135 (genetic sex p = 0.02).(b) Genes chosen using models with "laboratory-of-origin" as random factor.Left panel: TOX2 (subsistence type p = 0.006).Middle panel: PCDHA2 (tissue type p = 0.03).Right panel: RCOR1 (genetic sex p = 0.04).dataset to a minimum of 25 or 30 individuals, using only genomes with ≥10× coverage, or using ANOVA produced qualitatively the same outcomes.
In addition, we repeated the analysis using a cutoff of ≥10 reads per CpG position and 20 genomes with sufficient coverage.
We again found similar results, with 44, 37, and 3 genes with BHcorrected p < 0.05 for subsistence type, tissue type, and genetic sex, respectively, and no significant functional enrichment (Text S1).Our results overall suggest that the biological signals are limited, possibly obscured by the dominant laboratory-of-origin effect in the data.

| No significant correlation with subsistence-type effects in modern-day Africa
Although our analyses above did not yield any clear signs of subsistence-related differential methylation, weak but authentic signals might still be detected by comparing our MS data with subsistence-related DMGs identified in modern-day populations (Fagny et al., 2015), assuming Neolithic shifts would create convergent methylation signatures.We decided to run this comparison (a) on our full dataset of HG-NF differences, and (b) separately on three paleomethylome datasets from different laboratories where both subsistence types were represented (Table 1); we considered that the latter approach might help remove confounding between real signals and technical effects.
To this end, we utilized methylation differences documented between modern-day HGs and agriculturalists in Central Africa, measured in whole blood samples using bisulfite treatment and the Illumina 450K array (Fagny et al., 2015).The authors of this study reported c.9000 and c.6000 genes that included CpG sites differentially methylated between independent groups of traditional HGs and agriculturalists living in Eastern Central Africa (EC Africa) or in Western Central Africa (WC Africa), respectively.
We found 7890 genes overlapping between our paleomethylome dataset and the modern-day African dataset.Across these genes, we calculated the correlation between methylation differences between HG-NF groups in our dataset, and the log-transformed mean fold change [log(FC)] values between modern-day HGs and agriculturalists groups in the EC Africa and WC Africa datasets as calculated by Fagny et al. (2015) (Section 2).We observed a significant positive correlation (Spearman's rank correlation coefficient r = 0.34, p < 0.01; Table 1) between methylation differences in modern-day HGs and agriculturalists measured in EC Africa and WC Africa, in line with the original publication (Fagny et al., 2015).However, no consistent correlation could be observed between HG-NF differences

Note:
The upper triangle panel reports Spearman rank correlation coefficient r, whereas the lower triangle shows p-values."ECAfrica" and "WCAfrica" represent present-day HG-agriculturalist methylation differences (log fold-change) measured in humans from East Central Africa and West Central Africa, respectively."Boston," "Stanford," and "Mainz" stand for MS differences between HG-NF groups measured using only paleogenomes produced in the respective city (Table S1).within our paleomethylome dataset or between differences in the paleogenomes and HG-agriculturalist differences measured either in EC Africa or in WC Africa.Two nominally significant correlations were negative and all correlations were weak (Spearman's r < 0.05) across the 2507 genes shared across all datasets (Table 1).

| Lack of sex-related X chromosome methylation signatures among the 34 paleogenomes
The X chromosome (chrX) is expected to be methylated at higher rates in females compared to males due to female X chromosome inactivation (Liu et al., 2010).Indeed, Liu et al. (2023) recently reported clear clustering of X chromosome MS values measured in ancient female and male horses.To investigate such signal among the 34 paleogenomes used in this study, we prepared a chrX paleomethylome dataset, normalizing the chrX MS scores by the mean autosomal MS score for that individual.We then tested each chrX gene for sex differences using ANOVA, using either "laboratory-of-origin" or "individual" as random factors.Unexpectedly, no chrX gene was significant for sex after the BH correction (p > 0.05).A plot of chrX MS distributions between female and male individuals across all 34 paleogenomes, or only using 13 NF paleogenomes from Mainz, likewise revealed no obvious difference between sexes (Figure 4).This suggests that the overall biological signal in the dataset is indeed limited.

| DISCUSSION
Today, thousands of human paleogenomes are being produced every year and there is growing interest in using these to study biological processes beyond historical and social questions (Orlando et al., 2015).This includes the study of DNA methylation levels.Even though cytosine methylation appears to survive in aDNA (Gokhman et al., 2014;Pedersen et al., 2014;Seguin-Orlando et al., 2015), it has been yet unclear whether the highly variable nature of the published paleogenomic data could allow reproducible signals to be inferred from joint datasets from different laboratories.
Here we investigated biological signals related to tissue source, sex, and subsistence type in a heterogeneous paleomethylome dataset comprising genomes from six different laboratories.We limited the calls to CpG sites with a minimum of 4 reads, normalized the data by subsampling to account for average coverage differences, and ran analyses using several different comparative approaches.Beyond hypomethylation of CGIs, we could not recover any biological signal that reached genome-wide statistical significance.
Whether universal subsistence-type effects related to huntergatherer versus agriculturalist lifeways might be prevalent in bone methylomes is an open hypothesis.Hence, not finding a consistent signal in this dataset may not be surprising and attributable to a diversity of possible effects, including the lack of a real convergent signal, or small sample sizes.However, the lack of tissue (bone vs. tooth) or sex signatures, including on chrX, was unexpected.cent study that reported significant methylation differences among ancient human populations using 1240K capture data and pooling across samples (Barouch et al., 2024); however, it remains possible that batch effects might also have contributed to the detected differences.
In our heterogeneous dataset, the most prominent clustering was by laboratory-of-origin.Such technical effects on methylation scores could be due to differences in mean depth-of-coverage, as well as variable coverage patterns across paleogenomes, which, in turn, could be driven by laboratory protocol differences in aDNA isolation, library preparation, or sequencing.We hypothesize that such technical variability overshadows any differential methylation signals that are subtle and measured indirectly.Hence, based on our results, strict control of technical effects and the wider production of relatively high coverage (e.g., >10×) UDG-treated shotgun genomes (Niiranen et al., 2022) would be highly favorable for future paleomethylome studies.Meanwhile, using alternative experimental strategies could also be an option for paleomethylomics.For instance, Sawyer et al. (2023) recently reported that bisulfite treatment of aDNA coupled with a single-stranded library preparation protocol could produce similar or even more accurate CpG methylation estimates as that inferred indirectly from postmortem damage, using much lower amounts of sequencing data.
Such exploration is highly welcome, as high-resolution paleomethylome information from bone and tooth tissue can have diverse applications.These range from estimating methylation age, physiological/metabolic stress levels, or pathogenic responses, and could be applied to evolutionary studies of humans (Zhur et al., 2021), domestic species (MacHugh et al., 2016), or wild populations.This information, in turn, could help reconstruct past environments and also resolve long-standing questions on the role of epigenetic responses in adaptation to new environments.
a. Full data for linear mixed models: Here, we used all normalized MS values for all CpGs overlapping a gene.Each individual may be represented by multiple CpG positions per gene (median 261).

F
I G U R E 1 The demographic characteristics of the 34 ancient genomes used in this study and their genome-wide methylation scores.(a) The excavation locations of ancient individuals are included in this study.Color coding indicates subsistence type.(b) Left panel: Violin plots of the methylation score (MS) data related to ancient individuals included in this study.The brown and blue points indicate the mean and the median, respectively.The x-axis shows the log2-transformed MS values.The y-axis represents the ancient individuals.Right panel: Zoomed-in version of the left panel.(c) The distribution of mean MS per individual on CpG islands and genomic sites representing shelves, shores, and open seas.

Model 1 :
deamination∼subsistence type + tissue type + genetic sex + Error(individual) Model 2: deamination∼subsistence type + tissue type + genetic sex + Error(laboratory-of-origin) Multi-dimensional scaling (MDS) plots of 32 paleomethylome profiles.The data were created by Monte Carlo normalizing MS values 20 times followed by averaging per gene.(a) MDS plot labeled by subsistence type of individuals.Blue: HG, Red: NF.(b) MDS plot labeled by the laboratory-of-origin (indicated by the city).The Motala12 and K14 genomes were not included in the analyses due to their outlier profiles compared to the rest likely representing technical effects (FigureS8), which leaves us with five laboratories.

TA B L E 1
Pairwise comparisons of HG-agriculturalist DNA methylation differences between ancient and presentday human datasets.F I G U R E 4 Violin plots of X chromosome methylation estimates in female and male ancient genomes.The y-axes show chrX MS values divided by the mean autosomal MS for the same individual.Left Panel: ChrX MS values of all 34 paleogenomes for 12 females (XX) and 22 males (XY).The MS values for females and males were pooled after normalizing by dividing the chrX MS values by the mean autosomal MS value for the same individual.Right Panel: ChrX MS values from NF individuals from Marchi et al. (2022).These genomes were chosen to remove the possible influence of other factors (different laboratory and subsistence type effects) on chrX methylation estimates.
Our negative results appear to contrast with the recent report by Liu et al. (2023) who identified systematic methylation signatures of sex, age, and castration in paleogenomes, or those by Hanghøj et al. (2016) who clustered genomes based on tissue type.However, the first study used >5× coverage genomes produced in the same laboratory, and the second study used data from two laboratories and only >14× genomes.Our results also contrast with another re-